Automatic Training of Page Segmentation Algorithms: An Optimization Approach

نویسندگان

  • Song Mao
  • Tapas Kanungo
چکیده

Most page segmentation algorithms have userspecifiable free parameters. However, algorithm designers typically do not provide a quantitative/rigorous method for choosing values for these parameters. The free parameter values can affect the segmentation result quite drastically and are very dependent on the particular dataset that the algorithm is being used on. In this paper, we present an automatic training method for choosing free parameters of page segmentation algorithms. The automatic training problem is posed as a multivariate non-smooth function optimization problem. An efficient direct search method — simplex method — is used to solve this optimization problem. This training method is then applied to the training of Kise’s page segmentation algorithm. It is found that a set of optimal parameter values and their corresponding performance index can be found using relatively few function evaluations. The UW III dataset was used for conducting our experiments.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Empirical Performance Evaluation Methodology and Its Application to Page Segmentation Algorithms

ÐWhile numerous page segmentation algorithms have been proposed in the literature, there is lack of comparative evaluationÐempirical or theoreticalÐof these algorithms. In the existing performance evaluation methods, two crucial components are usually missing: 1) automatic training of algorithms with free parameters and 2) statistical and error analysis of experimental results. In this paper, w...

متن کامل

A Methodology for Empirical Performance Evaluationof Page Segmentation AlgorithmsSong

Document page segmentation is a crucial preprocessing step in Optical Character Recognition (OCR) systems. While numerous page segmentation algorithms have been proposed , there is relatively less literature on comparative evaluation | empirical or theoretical | of these algorithms. For the existing performance evaluation methods, two crucial components are usually missing: 1) automatic trainin...

متن کامل

A Hybrid Optimization Algorithm for Learning Deep Models

Deep learning is one of the subsets of machine learning that is widely used in Artificial Intelligence (AI) field such as natural language processing and machine vision. The learning algorithms require optimization in multiple aspects. Generally, model-based inferences need to solve an optimized problem. In deep learning, the most important problem that can be solved by optimization is neural n...

متن کامل

A Hybrid Optimization Algorithm for Learning Deep Models

Deep learning is one of the subsets of machine learning that is widely used in Artificial Intelligence (AI) field such as natural language processing and machine vision. The learning algorithms require optimization in multiple aspects. Generally, model-based inferences need to solve an optimized problem. In deep learning, the most important problem that can be solved by optimization is neural n...

متن کامل

Reducing Light Change Effects in Automatic Road Detection

Automatic road extraction from aerial images can be very helpful in traffic control and vehicle guidance systems. Most of the road detection approaches are based on image segmentation algorithms. Color-based segmentation is very sensitive to light changes and consequently the change of weather condition affects the recognition rate of road detection systems. In order to reduce the light change ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2000